# Load packages
library(bayesrules)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.4 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 2.0.1 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(rstan)
## Loading required package: StanHeaders
## rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
##
## Attaching package: 'rstan'
## The following object is masked from 'package:tidyr':
##
## extract
library(rstanarm)
## Loading required package: Rcpp
## This is rstanarm version 2.21.1
## - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
## - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
## - For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores())
##
## Attaching package: 'rstanarm'
## The following object is masked from 'package:rstan':
##
## loo
library(bayesplot)
## This is bayesplot version 1.8.1
## - Online documentation and vignettes at mc-stan.org/bayesplot
## - bayesplot theme set to bayesplot::theme_default()
## * Does _not_ affect other ggplot2 plots
## * See ?bayesplot_theme_set for details on theme setting
library(tidybayes)
library(janitor)
##
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
library(broom.mixed)
Since a model line can cross anywhere along the y-axis and the slope of a line can be any positive or negative value (or even 0), the intercept and slope regression parameters, β0 and β1, can technically take any value on the real line. So it's reasonable to utilize Normal prior models, which also live on the entire real line, for β0 and β1.
Since the standard deviation parameter σ must be positive, it's unreasonable to utilize a Normal prior model for σ: the Normal model is not restricted to positive values but lives on the entire real line.
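A quick base R check makes this concrete (the Normal mean of 5 and sd of 3, and the Exponential rate of 0.5, are illustrative values, not part of the exercise): a Normal prior places positive probability on impossible negative values of σ, while an Exponential prior places none.

```r
# Probability that a Normal prior on sigma lands below 0
# (mean 5, sd 3 are illustrative hyperparameter values)
pnorm(0, mean = 5, sd = 3)    # about 0.048: non-negligible mass on sigma < 0

# An Exponential prior is restricted to positive values,
# so it places zero probability below 0
pexp(0, rate = 0.5)           # exactly 0
```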
While both vague priors and weakly informative priors reflect general prior uncertainty about the model parameters, a vague prior might be so diffuse that it puts weight on non-sensible parameter values. By contrast, weakly informative priors are relatively more focused: they reflect general prior uncertainty across a range of sensible parameter values. Weakly informative priors are tuned to sensible parameter values by taking into account the scales of the variables of interest, which also fosters computationally efficient posterior simulation: the chains don't waste time exploring non-sensible parameter values, as they can under vague priors.
Here the response variable (Y) is a person’s height, and the predictor variable (X) is a person’s arm length.
Here the response variable (Y) is a person’s carbon footprint (in annual CO2 emissions), and the predictor variable (X) is the distance between a person’s home and work.
Here the response variable (Y) is a child’s vocabulary level, and the predictor variable (X) is a child’s age.
Here the response variable (Y) is a person’s reaction time, and the predictor variable (X) is a person’s sleep habits.
β0 indicates the typical height in cm of a baby kangaroo when it is 0 months old (Xi = 0). It provides a baseline for where the prior model lives along the y-axis. β1 indicates the typical change in a baby kangaroo's height in cm for every one-unit (month) increase in age. In this particular model with only one quantitative predictor X, the coefficient β1 is equivalent to a slope. My prior understanding suggests that β1 is positive, since the height of a baby kangaroo normally increases with its age (as it grows older).
β0 indicates the typical number of GitHub followers of a data scientist who had 0 GitHub commits in the past week (Xi = 0). It provides a baseline for where the prior model lives along the y-axis. β1 indicates the typical change in a data scientist's number of GitHub followers for every one-unit (one commit) increase in their GitHub commits in the past week. In this particular model with only one quantitative predictor X, the coefficient β1 is equivalent to a slope. My prior understanding suggests that β1 is positive, since more GitHub commits in the past week (actively revising one's work) would probably raise people's attention to and interest in that work, thus bringing more GitHub followers.
β0 indicates the typical number of visitors to a local park on a day of 0 rainfall in inches (Xi = 0). It provides a baseline for where the prior model lives along the y-axis. β1 indicates the typical change in the number of visitors to a local park on a given day for every one unit (inch) increase in that day’s rainfall. In this particular model with only one quantitative predictor X, the coefficient β1 is equivalent to a slope. My prior understanding suggests that β1 is negative since normally the number of visitors to a local park on a day decreases with the rainfall on that day (people tend to be reluctant to go out visiting a park on rainy days).
β0 indicates the typical hours of Netflix that a person watches daily when the typical number of hours that they sleep is 0 (Xi = 0). It provides a baseline for where the prior model lives along the y-axis. β1 indicates the typical change in daily hours of Netflix watched for every one-unit (hour) increase in the typical number of hours that they sleep. In this particular model with only one quantitative predictor X, the coefficient β1 is equivalent to a slope. My prior understanding suggests that β1 is negative, since the more hours a person typically spends sleeping, the fewer hours they are likely to spend watching Netflix each day (and conversely, the habit of binge-watching Netflix may decrease a person's average sleep time).
At any value of predictor X, the observed values of Y will vary normally around their average μ with consistent standard deviation σ that indicates the degree to which observations might deviate from the model lines. With a small σ, when the observed data deviate little from and are tightly around the mean model line, the observed outcomes Y differ by a little from the average μ at the same value of X, which indicates X is a strong predictor of Y (similarly, with a larger σ, greater variability will be exhibited in observed outcomes among X of the same value, thus reflecting a weaker relationship between the variables).
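This idea can be illustrated with a small base R simulation (the line y = 10 + 2x and both σ values are illustrative assumptions, not from any exercise): simulating the same model line with a small versus a large σ shows how σ governs the apparent strength of the relationship.

```r
set.seed(1)
x <- runif(500, min = 0, max = 10)

# Same underlying line, two different sigmas (illustrative values)
y_small_sigma <- 10 + 2 * x + rnorm(500, sd = 1)    # tight around the line
y_large_sigma <- 10 + 2 * x + rnorm(500, sd = 20)   # widely scattered

# With a small sigma, X is a strong predictor of Y;
# with a large sigma, the same line yields a much weaker relationship
cor(x, y_small_sigma)
cor(x, y_large_sigma)
```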
The response variable Y is a person’s annual orange juice consumption (in gallons), and the predictor variable X is a person’s age (in years).
Yi|μ, σ ∼ N(μ, σ^2) (Yi denotes a person's annual orange juice consumption in gallons; μ denotes the global mean annual orange juice consumption in gallons across people of all ages combined; σ measures the variability in a person's annual consumption around the global mean μ across people of all ages)
Yi|β0, β1, σ ∼ N(μi, σ^2) with μi = β0 + β1Xi (Xi denotes a person's age in years; Yi denotes a person's annual orange juice consumption in gallons; μi denotes the local mean annual consumption among people at a given age; σ measures the variability in a person's annual consumption around the local mean μi among people of similar age)
Unknown parameters in the model are β0, β1, and σ. The intercept coefficient β0 can technically take any value, since it indicates a person's annual consumption of orange juice when the person is 0 years old (Xi = 0). It provides a baseline for where our model "lives" along the y-axis; if it falls far outside the norm (e.g., negative or huge), we can later center the predictor so the intercept reflects typical consumption at a typical age, easing interpretation. The coefficient β1 can take any value: assuming a linear relationship between Y and X, annual consumption could increase with age (an adult's appetite is normally larger than a child's), decrease with age (children may have a particular preference for orange juice while adults have many more beverage choices), or show no relationship at all. σ can take any positive value, since it is a standard deviation parameter and must be positive.
Normal prior models are suitable choices for β0 and β1 (m0, s0, m1, s1 are hyperparameters).
β0 ∼ N(m0, s0^2)
β1 ∼ N(m1, s1^2)
Generally the intercept and slope regression parameters, β0 and β1, can technically take any value on the real line, so it's reasonable to utilize Normal prior models for β0 and β1, which also live on the entire real line. Normal priors remain appropriate in this specific case, where β0 and β1 plausibly take non-negative values (a subset of the reals): Yi is measured on a continuous scale and can be well beyond 1, and measurements like consumption volume are often roughly symmetrically distributed around some global average, here μ.
An Exponential prior model is a suitable choice for σ.
σ ∼ Exp(l)
The Exponential model is a special case of the Gamma with shape 1: Exp(l) = Gamma(1, l). Since the standard deviation parameter σ must be positive, it's reasonable to utilize an Exponential model, which is likewise restricted to positive values.
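This relationship is easy to verify numerically in base R (the rate l = 0.5 is an arbitrary illustrative choice):

```r
# Exp(l) has the same density as Gamma(shape = 1, rate = l)
l <- 0.5
x <- seq(0.1, 10, by = 0.1)
all.equal(dexp(x, rate = l), dgamma(x, shape = 1, rate = l))  # TRUE
```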
The predictor variable X is the high temperature (in degrees Fahrenheit) on one day, and the response variable Y is the high temperature (in degrees Fahrenheit) on the next day.
Yi|μ, σ ∼ N(μ, σ^2) (Yi denotes the high temperature in degrees Fahrenheit on the next day; μ denotes the global mean next-day high temperature across all days combined; σ measures the variability of next-day highs around the global mean μ across all days)
Yi|β0, β1, σ ∼ N(μi, σ^2) with μi = β0 + β1Xi (Xi denotes the high temperature in degrees Fahrenheit on one day; Yi denotes the high temperature in degrees Fahrenheit on the next day; μi denotes the local mean next-day high temperature among days whose own high was Xi; σ measures the variability of next-day highs around this local mean)
Unknown parameters in the model are β0, β1, and σ. The intercept coefficient β0 can technically take any value, since it indicates tomorrow's high temperature when today's high temperature is 0 (Xi = 0). It provides a baseline for where our model "lives" along the y-axis; if it falls far outside the norm (e.g., too low for the region or even negative), we can later center the predictor so the intercept reflects the next day's typical high following a day with a typical high. The coefficient β1 plausibly takes a positive value: assuming a linear relationship between Y and X, a warm day will normally be followed by another warm day, since temperature tends not to change dramatically from one day to the next. σ can take any positive value, since it is a standard deviation parameter and must be positive.
Normal prior models are suitable choices for β0 and β1 (m0, s0, m1, s1 are hyperparameters).
β0 ∼ N(m0, s0^2)
β1 ∼ N(m1, s1^2)
Generally the intercept and slope regression parameters, β0 and β1, can technically take any value on the real line, so it's reasonable to utilize Normal prior models for β0 and β1, which also live on the entire real line. Normal priors remain appropriate in this specific case, where β0 and β1 plausibly take positive values (a subset of the reals): Yi is measured on a continuous scale and can be well beyond 1, and measurements like high temperature are often roughly symmetrically distributed around some global average, here μ.
An Exponential prior model is a suitable choice for σ.
σ ∼ Exp(l)
The Exponential model is a special case of the Gamma with shape 1: Exp(l) = Gamma(1, l). Since the standard deviation parameter σ must be positive, it's reasonable to utilize an Exponential model, which is likewise restricted to positive values.
False
True
bunnies_model <- stan_glm(height ~ age, data = bunnies,
family = gaussian,
prior_intercept = normal(14, 3),
prior = normal(3, 1),
prior_aux = exponential(0.5),
chains = 4, iter = 5000*2, seed = 84735)
songs_model <- stan_glm(Clicks ~ Snaps, data = songs,
family = gaussian,
prior_intercept = normal(80, 20),
prior = normal(100, 50),
prior_aux = exponential(0.0025),
chains = 4, iter = 5000*2, seed = 84735)
dogs_model <- stan_glm(Happiness ~ Age, data = dogs,
family = gaussian,
prior_intercept = normal(6, 2),
prior = normal(2, 1),
prior_aux = exponential(0.25),
chains = 4, iter = 5000*2, seed = 84735)
priors: β0c ∼ N(5000, 2000^2)
β1 ∼ N(-10, 5^2)
σ ∼ Exp(0.0005)
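Before fitting, these priors can be sanity-checked by simulating them directly in base R (no Stan required); this is just a sketch of the stated priors, not part of the original analysis.

```r
set.seed(84735)
beta0c <- rnorm(10000, mean = 5000, sd = 2000)  # centered intercept prior
beta1  <- rnorm(10000, mean = -10,  sd = 5)     # slope prior
sigma  <- rexp(10000, rate = 0.0005)            # sd prior, mean 1/0.0005 = 2000

# The slope prior puts most of its mass on negative values,
# matching the belief that ridership decreases with humidity
mean(beta1 < 0)   # roughly 0.98

# The Exp(0.0005) prior anticipates sigma around 2000 rides
mean(sigma)
```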
bike_priors <- stan_glm(
rides ~ humidity, data = bikes,
family = gaussian,
prior_intercept = normal(5000, 2000),
prior = normal(-10, 5),
prior_aux = exponential(0.0005),
prior_PD = TRUE,
chains = 5, iter = 4000*2, seed = 2666)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.001812 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 18.12 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 8000 [ 0%] (Warmup)
## Chain 1: Iteration: 800 / 8000 [ 10%] (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%] (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%] (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%] (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%] (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%] (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%] (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%] (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%] (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%] (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.827417 seconds (Warm-up)
## Chain 1: 0.103766 seconds (Sampling)
## Chain 1: 0.931183 seconds (Total)
## Chain 1:
# 100 prior model lines
bikes %>%
add_fitted_draws(bike_priors, n = 100) %>%
ggplot(aes(x = humidity, y = rides)) +
geom_line(aes(y = .value, group = .draw), alpha = 0.15)
## Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
## Use [add_]epred_draws() to get the expectation of the posterior predictive.
## Use [add_]linpred_draws() to get the distribution of the linear predictor.
## For example, you used [add_]fitted_draws(..., scale = "response"), which
## means you most likely want [add_]epred_draws(...).
# 4 prior simulated datasets
set.seed(6)
bikes %>%
add_predicted_draws(bike_priors, n = 4) %>%
ggplot(aes(x = humidity, y = rides)) +
geom_point(aes(y = .prediction, group = .draw)) +
facet_wrap(~ .draw)
## Warning:
## In add_predicted_draws(): The `n` argument is a deprecated alias for `ndraws`.
## Use the `ndraws` argument instead.
## See help("tidybayes-deprecated").
# Load and plot data
data(bikes)
ggplot(bikes, aes(x = humidity, y = rides)) +
geom_point(size = 0.5) +
geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
# Perform a posterior simulation
bike_posterior <- update(bike_priors, prior_PD = FALSE)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000138 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.38 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 8000 [ 0%] (Warmup)
## Chain 1: Iteration: 800 / 8000 [ 10%] (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%] (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%] (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%] (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%] (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%] (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%] (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%] (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%] (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%] (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.363454 seconds (Warm-up)
## Chain 1: 0.349386 seconds (Sampling)
## Chain 1: 0.71284 seconds (Total)
## Chain 1:
mcmc_trace(bike_posterior, size = 0.1)
mcmc_dens_overlay(bike_posterior)
mcmc_acf(bike_posterior)
neff_ratio(bike_posterior)
## (Intercept) humidity sigma
## 0.87165 0.92065 1.06490
rhat(bike_posterior)
## (Intercept) humidity sigma
## 1.0000062 0.9999326 0.9998849
There’s no discernible trend or flat lines in the trace plots and the density plots are consistent across five chains. The autocorrelation quickly drops off and is effectively 0 by lag 5. The effective sample size ratios are around 1 and the R-hat values are very close to 1. All this indicates that the chains are stable, mixing quickly, and behaving much like an independent sample. So we can “trust” these simulation results.
# 100 posterior model lines
bikes %>%
add_fitted_draws(bike_posterior, n = 100) %>%
ggplot(aes(x = humidity, y = rides)) +
geom_line(aes(y = .value, group = .draw), alpha = 0.15)
## Warning: `fitted_draws` and `add_fitted_draws` are deprecated as their names were confusing.
## Use [add_]epred_draws() to get the expectation of the posterior predictive.
## Use [add_]linpred_draws() to get the distribution of the linear predictor.
## For example, you used [add_]fitted_draws(..., scale = "response"), which
## means you most likely want [add_]epred_draws(...).
Compared to the prior model lines, the posterior model lines indicate that the decrease in ridership with humidity appears to be a bit less steep than we had anticipated. Further, the posterior plausible models are far less variable, indicating that we're far more confident about the relationship between ridership and humidity upon observing the data.
# Posterior summary statistics
tidy(bike_posterior, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.95)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 4017. 224. 3580. 4459.
## 2 humidity -8.44 3.40 -15.0 -1.88
## 3 sigma 1573. 50.0 1480. 1676.
## 4 mean_PPD 3484. 99.8 3288. 3678.
The posterior median of the σ parameter, roughly 1573, indicates that we can expect the observed ridership on a given day to deviate by about 1573 rides from the average ridership on days of the same humidity.
The 95% posterior credible interval for the humidity coefficient β1, (-15.0, -1.88), indicates that there's a 95% posterior probability that this slope lies somewhere between -15.0 and -1.88.
I think we have ample posterior evidence that there's a negative association between ridership and humidity. In our visual examination of 100 posterior plausible scenarios for the relationship between ridership and humidity in Exercise 9.11 part c, all lines exhibited negative associations; a line exhibiting no relationship or a positive relationship (β1 = 0 or β1 > 0) would stand out. More rigorously, the 95% credible interval for β1 in the tidy summary above, (-15.0, -1.88), lies entirely and well below 0.
We can also do a quick tabulation to approximate that there’s almost certainly a negative association, P (β1 < 0 | y) ≈ 1. Almost all of the 20000 Markov chain values of β1 are negative.
# Tabulate the beta_1 values that fall below 0
bike_posterior_df <- as.data.frame(bike_posterior)
bike_posterior_df %>%
mutate(below_0 = humidity < 0) %>%
tabyl(below_0)
## below_0 n percent
## FALSE 111 0.00555
## TRUE 19889 0.99445
# Predict rides for each parameter set in the chain
set.seed(84735)
predict_90 <- bike_posterior_df %>%
mutate(mu = `(Intercept)` + humidity*90,
y_new = rnorm(20000, mean = mu, sd = sigma))
head(predict_90, 3)
## (Intercept) humidity sigma mu y_new
## 1 3937.179 -4.579930 1594.279 3524.985 4588.741
## 2 4020.944 -5.210964 1591.189 3551.957 3354.904
## 3 4009.796 -6.991306 1528.442 3380.579 4574.090
The collection of 20,000 mu values approximates the posterior model for the typical ridership on 90% humidity days, μ = β0 + β1 ∗ 90. The 20,000 y_new values approximate the posterior predictive model of ridership for tomorrow, an individual 90% humidity day.
# Plot the posterior model of the typical ridership on 90% humidity days
ggplot(predict_90, aes(x = mu)) +
geom_density()
# Plot the posterior predictive model of tomorrow's ridership
ggplot(predict_90, aes(x = y_new)) +
geom_density()
Though they’re centered at roughly similar value (range), the posterior predictive model for mu is much narrower than that of y_new. The posterior model for μ merely captures the uncertainty in the average ridership on all X = 90% humidity days. Since there is so little uncertainty about this average, this interval visually appears like a wee dot. In contrast, the posterior predictive model for the number of rides tomorrow (a specific day) accounts for not only the average ridership on a 90% humidity day, but the individual variability from this average.
# Construct 80% posterior credible intervals
predict_90 %>%
summarize(lower_new = quantile(y_new, 0.1),
upper_new = quantile(y_new, 0.9))
## lower_new upper_new
## 1 1263.531 5283.417
The 80% posterior prediction interval for the number of rides tomorrow is roughly (1264, 5283), indicating that there's an 80% posterior probability that the number of rides tomorrow falls somewhere between about 1264 and 5283.
# Simulate a set of predictions
set.seed(84735)
shortcut_prediction <- posterior_predict(bike_posterior,
newdata = data.frame(humidity = 90))
# Construct an 80% posterior credible interval
posterior_interval(shortcut_prediction, prob = 0.80)
## 10% 90%
## 1 1263.531 5283.417
# Plot the approximate predictive model
mcmc_dens(shortcut_prediction) +
xlab("predicted ridership on a 90% humidity day")
Ridership tends to decrease as windspeed increases. Specifically, for every one-unit increase in windspeed, ridership tends to decrease by 6 rides, though this average decrease could be anywhere between 0 and 12.
Ridership is only weakly related to windspeed. At any given windspeed, ridership will tend to vary with a large standard deviation of 1000 rides.
priors: β0c ∼ N(5000, 2000^2)
β1 ∼ N(-6, 3^2)
σ ∼ Exp(0.001)
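The Exp(0.001) choice can be checked against the stated understanding in base R: an Exponential with rate l has mean 1/l, so this prior anticipates a standard deviation of about 1000 rides (the simulation below is a sketch, independent of the model fit).

```r
# Prior mean of sigma under Exp(0.001)
1 / 0.001      # 1000 rides, matching the stated "standard deviation of 1000"

set.seed(2666)
sigma_draws <- rexp(10000, rate = 0.001)
mean(sigma_draws)   # near 1000
```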
ridership_priors <- stan_glm(
rides ~ windspeed, data = bikes,
family = gaussian,
prior_intercept = normal(5000, 2000),
prior = normal(-6, 3),
prior_aux = exponential(0.001),
prior_PD = TRUE,
chains = 5, iter = 4000*2, seed = 2666)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 5.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.53 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 8000 [ 0%] (Warmup)
## Chain 1: Iteration: 800 / 8000 [ 10%] (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%] (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%] (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%] (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%] (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%] (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%] (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%] (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%] (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%] (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.826717 seconds (Warm-up)
## Chain 1: 0.114093 seconds (Sampling)
## Chain 1: 0.94081 seconds (Total)
## Chain 1:
# 100 prior model lines
bikes %>%
add_epred_draws(ridership_priors, ndraws = 100) %>%
ggplot(aes(x = windspeed, y = rides)) +
geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
# 4 prior simulated datasets
set.seed(6)
bikes %>%
add_predicted_draws(ridership_priors, ndraws = 4) %>%
ggplot(aes(x = windspeed, y = rides)) +
geom_point(aes(y = .prediction, group = .draw)) +
facet_wrap(~ .draw)
The 100 prior model lines capture my prior understanding that ridership decreases with windspeed. Ridership tends to be around 5000 on average-windspeed days but could range from roughly 1000 to 9000. The variability in these lines adequately reflects my overall uncertainty about this association: ridership is weakly associated with windspeed and exhibits great variability (a standard deviation of 1000 rides) at any given windspeed. Ridership typically decreases by 6 rides with every one percentage point increase in windspeed, though this average decrease could range from 0 to 12. The four ridership datasets simulated from the Normal data model using four prior plausible sets of (β0, β1, σ) values are also consistent with my prior assumptions about the rate of decrease in ridership with windspeed, the baseline ridership, and the variability in ridership.
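As a quick numeric check (a sketch using only the stated prior, not the simulated draws), roughly 95% of prior-plausible slopes under the N(-6, 3^2) prior fall between about -12 and 0:

```r
# Middle 95% of prior-plausible slopes under beta_1 ~ N(-6, 3^2)
slope_range <- qnorm(c(0.025, 0.975), mean = -6, sd = 3)
slope_range  # roughly -11.88 and -0.12
```

This matches the stated belief that the average decrease could be anywhere between 0 and 12 rides.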
# Load and plot data
data(bikes)
ggplot(bikes, aes(x = windspeed, y = rides)) +
geom_point(size = 0.5, alpha = 0.5) +
geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
The observed patterns and my prior understanding both indicate that ridership decreases with windspeed. However, the observed data indicate that ridership tends to be around 3500 on average-windspeed days and could range from roughly 0 to 7000. The variability in both the observed trend and my prior understanding adequately reflects the overall uncertainty about this association: ridership is relatively weakly associated with windspeed, and the observed data exhibit variability at a given windspeed similar to what my prior anticipated. While my prior understanding indicates that ridership typically decreases by 6 rides with every one percentage point increase in windspeed, with the average decrease ranging from 0 to 12, the observed data show a steeper rate of decrease in ridership with windspeed.
# Perform a posterior simulation
ridership_posterior <- update(ridership_priors, prior_PD = FALSE)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 2.1e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 8000 [ 0%] (Warmup)
## Chain 1: Iteration: 800 / 8000 [ 10%] (Warmup)
## Chain 1: Iteration: 1600 / 8000 [ 20%] (Warmup)
## Chain 1: Iteration: 2400 / 8000 [ 30%] (Warmup)
## Chain 1: Iteration: 3200 / 8000 [ 40%] (Warmup)
## Chain 1: Iteration: 4000 / 8000 [ 50%] (Warmup)
## Chain 1: Iteration: 4001 / 8000 [ 50%] (Sampling)
## Chain 1: Iteration: 4800 / 8000 [ 60%] (Sampling)
## Chain 1: Iteration: 5600 / 8000 [ 70%] (Sampling)
## Chain 1: Iteration: 6400 / 8000 [ 80%] (Sampling)
## Chain 1: Iteration: 7200 / 8000 [ 90%] (Sampling)
## Chain 1: Iteration: 8000 / 8000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.302538 seconds (Warm-up)
## Chain 1: 0.408742 seconds (Sampling)
## Chain 1: 0.71128 seconds (Total)
## Chain 1:
# MCMC diagnostics
mcmc_trace(ridership_posterior, size = 0.1)
mcmc_dens_overlay(ridership_posterior)
mcmc_acf(ridership_posterior)
neff_ratio(ridership_posterior)
## (Intercept) windspeed sigma
## 1.00080 1.00585 0.97570
rhat(ridership_posterior)
## (Intercept) windspeed sigma
## 0.9999988 0.9999940 0.9999422
There are no discernible trends or flat spots in the trace plots, and the density plots are consistent across the five chains. The autocorrelation drops off quickly and is effectively 0 by lag 5. The effective sample size ratios are around 1 and the R-hat values are very close to 1. All of this indicates that the chains are stable, mixing quickly, and behaving much like an independent sample, so we can "trust" these simulation results.
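To build intuition for what R-hat measures, here is a toy computation on simulated, well-mixed Normal chains (not the actual model draws), following the classic comparison of between- and within-chain variance:

```r
set.seed(84735)
draws <- matrix(rnorm(5 * 4000), ncol = 5)  # 5 simulated chains of 4000 draws each
n <- nrow(draws)
B <- n * var(colMeans(draws))               # between-chain variance
W <- mean(apply(draws, 2, var))             # within-chain variance
rhat_toy <- sqrt(((n - 1) / n * W + B / n) / W)
rhat_toy                                    # close to 1 for well-mixed chains
```

When chains disagree (e.g., one stuck in a different region), B inflates and R-hat climbs noticeably above 1.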
# 100 posterior model lines
bikes %>%
add_epred_draws(ridership_posterior, ndraws = 100) %>%
ggplot(aes(x = windspeed, y = rides)) +
geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
Compared to the prior model lines, the posterior model lines indicate that the decrease in ridership with windspeed is steeper than we had anticipated. Further, the posterior plausible models are far less variable, indicating that we're far more confident about the relationship between ridership and windspeed upon observing the data.
# Posterior summary statistics
tidy(ridership_posterior, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.95)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 3596. 79.3 3439. 3753.
## 2 windspeed -8.67 2.94 -14.3 -2.94
## 3 sigma 1566. 50.0 1476. 1670.
## 4 mean_PPD 3484. 99.7 3288. 3679.
The posterior median of the σ parameter, roughly 1566, indicates that we can expect the observed ridership on a given day to deviate from the average ridership on days with the same windspeed by about 1566 rides. The 95% posterior credible interval for the windspeed coefficient β1, (-14.3, -2.9), indicates that this slope could plausibly range anywhere between -14.3 and -2.9.
Therefore, I think we have ample posterior evidence of a negative association between ridership and windspeed. In our visual examination of 100 posterior plausible scenarios for the relationship between ridership and windspeed above, all exhibited negative associations; a line exhibiting no relationship or a positive relationship (β1 = 0 or β1 > 0) would stand out. More rigorously, the 95% credible interval for β1 in the tidy summary above, (-14.3, -2.9), lies entirely and well below 0.
We can also do a quick tabulation to approximate that there's almost certainly a negative association, P(β1 < 0 | y) ≈ 1: almost all of the 20,000 Markov chain values of β1 are negative.
# Tabulate the beta_1 values that fall below 0
ridership_posterior_df <- as.data.frame(ridership_posterior)
ridership_posterior_df %>%
mutate(below_0 = windspeed < 0) %>%
tabyl(below_0)
## below_0 n percent
## FALSE 32 0.0016
## TRUE 19968 0.9984
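The same probability can also be read off as the mean of an indicator over the draws. As a self-contained sketch, here the posterior for β1 is stood in for by a Normal approximation built from the tidy summary's estimate and standard error (an assumption for illustration, not the actual chain values):

```r
set.seed(84735)
# Stand-in Normal approximation to the posterior of beta_1 (assumed, for illustration)
slope_draws <- rnorm(20000, mean = -8.67, sd = 2.94)
mean(slope_draws < 0)  # approximately 0.998, matching the tabyl result
```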
penguins_prior_default <- stan_glm(
flipper_length_mm ~ bill_length_mm, data = penguins_bayes,
family = gaussian,
prior_intercept = normal(200, 25, autoscale = TRUE),
prior = normal(0, 2.5, autoscale = TRUE),
prior_aux = exponential(1, autoscale = TRUE),
prior_PD = TRUE,
chains = 4, iter = 5000*2, seed = 84735)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.7e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.17 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.251659 seconds (Warm-up)
## Chain 1: 0.124617 seconds (Sampling)
## Chain 1: 0.376276 seconds (Total)
## Chain 1:
prior_summary(penguins_prior_default)
## Priors for model 'penguins_prior_default'
## ------
## Intercept (after predictors centered)
## Specified prior:
## ~ normal(location = 200, scale = 25)
## Adjusted prior:
## ~ normal(location = 200, scale = 352)
##
## Coefficients
## Specified prior:
## ~ normal(location = 0, scale = 2.5)
## Adjusted prior:
## ~ normal(location = 0, scale = 6.4)
##
## Auxiliary (sigma)
## Specified prior:
## ~ exponential(rate = 1)
## Adjusted prior:
## ~ exponential(rate = 0.071)
## ------
## See help('prior_summary.stanreg') for more details
data: Yi | β0, β1, σ ∼ N(μi, σ^2) with μi = β0 + β1Xi
priors: β0c ∼ N(200, 352^2)
β1 ∼ N(0, 6.4^2)
σ ∼ Exp(0.071)
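The adjusted scales come from rstanarm's autoscaling, which (roughly) multiplies the intercept and sigma scales by the SD of the response and the coefficient scale by sd(y)/sd(x). A sketch using approximate round sample SDs (sd(flipper) ≈ 14.1, sd(bill) ≈ 5.5 — assumed values, not computed from the data here) recovers the reported adjustments:

```r
sd_y <- 14.1  # approximate SD of flipper_length_mm (assumed)
sd_x <- 5.5   # approximate SD of bill_length_mm (assumed)
c(intercept_scale = 25 * sd_y,     # ~352
  coef_scale = 2.5 * sd_y / sd_x,  # ~6.4
  sigma_rate = 1 / sd_y)           # ~0.071
```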
# 100 prior model lines
penguins_1 <- na.omit(penguins_bayes) %>%
add_epred_draws(penguins_prior_default, ndraws = 100) %>%
ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
penguins_1
# 4 prior simulated datasets
set.seed(6)
penguins_2 <- na.omit(penguins_bayes) %>%
add_predicted_draws(penguins_prior_default, ndraws = 4) %>%
ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
geom_point(aes(y = .prediction, group = .draw)) +
facet_wrap(~ .draw)
penguins_2
# Load and plot data
data(penguins_bayes)
ggplot(penguins_bayes, aes(x = bill_length_mm, y = flipper_length_mm)) +
geom_point(size = 0.5) +
geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).
The observed data indicate that flipper length increases with bill length. Flipper length tends to be around 200 mm at the average bill length of 44 mm but could range from roughly 170 to 230 mm. The variability in the observed pattern reflects some overall uncertainty about the association, but flipper length is strongly associated with bill length and exhibits small variability at any given bill length.
penguins_4 <- na.omit(penguins_bayes) %>%
summarise(mean_bill_length = mean(bill_length_mm),
mean_flipper_length = mean(flipper_length_mm),
std_dev = sd(flipper_length_mm))
penguins_4
## # A tibble: 1 × 3
## mean_bill_length mean_flipper_length std_dev
## <dbl> <dbl> <dbl>
## 1 44.0 201. 14.0
# Perform a posterior simulation
penguins_posterior <- update(penguins_prior_default, prior_PD = FALSE)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 2e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.2 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.217019 seconds (Warm-up)
## Chain 1: 0.424555 seconds (Sampling)
## Chain 1: 0.641574 seconds (Total)
## Chain 1:
# 100 posterior model lines
penguins_3 <- na.omit(penguins_bayes) %>%
add_epred_draws(penguins_posterior, ndraws = 100) %>%
ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
geom_line(aes(y = .epred, group = .draw), alpha = 0.15)
penguins_3
# Posterior summary statistics
tidy(penguins_posterior, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.9)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 127. 4.76 119. 134.
## 2 bill_length_mm 1.69 0.108 1.52 1.87
## 3 sigma 10.6 0.408 10.0 11.3
## 4 mean_PPD 201. 0.809 200. 202.
The 90% posterior credible interval for the bill length coefficient β1, (1.52, 1.87), indicates that this slope could plausibly range anywhere between 1.52 and 1.87.
I think we have ample posterior evidence of a positive association between flipper length and bill length. In our visual examination of 100 posterior plausible scenarios for the relationship between flipper length and bill length in part b, all exhibited positive associations; a line exhibiting no relationship or a negative relationship (β1 = 0 or β1 < 0) would stand out. More rigorously, the 90% credible interval for β1 in the tidy summary above, (1.52, 1.87), lies entirely and well above 0.
We can also do a quick tabulation to approximate that there's almost certainly a positive association, P(β1 > 0 | y) ≈ 1: all of the 20,000 Markov chain values of β1 are positive.
# Tabulate the beta_1 values that fall above 0
penguins_posterior_df <- as.data.frame(penguins_posterior)
penguins_posterior_df %>%
mutate(above_0 = bill_length_mm > 0) %>%
tabyl(above_0)
## above_0 n percent
## TRUE 20000 1
# Predict flipper length for each parameter set in the chain
set.seed(84735)
predict_51 <- penguins_posterior_df %>%
mutate(mu = `(Intercept)` + bill_length_mm*51,
y_new = rnorm(20000, mean = mu, sd = sigma))
head(predict_51, 3)
## (Intercept) bill_length_mm sigma mu y_new
## 1 119.0921 1.848526 10.52732 213.3669 220.3911
## 2 138.4460 1.426936 10.44402 211.2197 209.9264
## 3 118.7147 1.855484 11.04049 213.3444 221.9655
The collection of 20,000 mu values approximates the posterior model for the typical flipper length among penguins with 51-mm bills, μ = β0 + β1 · 51. The 20,000 y_new values approximate the posterior predictive model of flipper length for Pablo, an individual penguin whose bill is 51 mm long.
# Plot the posterior model of the typical flipper length among penguins with 51mm bills
ggplot(predict_51, aes(x = mu)) +
geom_density()
# Plot the posterior predictive model for Pablo’s flipper length
ggplot(predict_51, aes(x = y_new)) +
geom_density()
Though they're centered at roughly the same value, the posterior model for μ is much narrower than the posterior predictive model for y_new. The posterior model for μ merely captures the uncertainty in the average flipper length among all penguins with 51-mm bills, and there is little uncertainty about this average. In contrast, the posterior predictive model for Pablo's flipper length (a specific penguin) accounts not only for the average flipper length among penguins with 51-mm bills, but also for individual variability from this average.
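The extra spread in y_new comes from layering Normal(0, σ) individual-level noise on top of μ. A minimal base-R sketch with made-up round numbers (center 213, μ-spread 1, σ = 10.6 — assumed values, not the actual chain) illustrates the gap:

```r
set.seed(1)
mu_toy <- rnorm(20000, mean = 213, sd = 1)           # uncertainty about the average only
y_new_toy <- rnorm(20000, mean = mu_toy, sd = 10.6)  # adds individual-level variability
c(sd_mu = sd(mu_toy), sd_y_new = sd(y_new_toy))      # y_new is far more variable
```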
# Construct 80% posterior credible intervals
predict_51 %>%
summarize(lower_new = quantile(y_new, 0.1),
upper_new = quantile(y_new, 0.9))
## lower_new upper_new
## 1 199.3957 226.6169
The 80% posterior prediction interval for Pablo's flipper length is (199, 227), indicating that there's an 80% posterior probability that Pablo's flipper length is somewhere between 199 and 227 mm.
# Construct an 80% posterior credible interval for mu
predict_51 %>%
summarize(lower_mu = quantile(mu, 0.1),
upper_mu = quantile(mu, 0.9))
## lower_mu upper_mu
## 1 211.6749 214.0955
The 80% credible interval for the typical flipper length among all penguins with 51mm bills, (212, 214), is narrower than the 80% posterior prediction interval for Pablo’s flipper length. The posterior model for μ captures only the uncertainty in the average flipper length among all penguins with X = 51mm bills, and there is little uncertainty about this average, so its posterior plausible values concentrate in a small range. The posterior predictive model for Pablo’s flipper length (a specific penguin) additionally accounts for individual variability around this average, so it exhibits more variability and yields a wider range of posterior plausible values. At the same 80% credible level, then, the interval for the typical flipper length is the narrower of the two.
# Simulate a set of predictions
set.seed(84735)
shortcut_prediction_1 <- posterior_predict(penguins_posterior,
newdata = data.frame(bill_length_mm = 51))
# Construct a 80% posterior credible interval
posterior_interval(shortcut_prediction_1, prob = 0.80)
## 10% 90%
## 1 199.3957 226.6169
# Plot the approximate predictive model
mcmc_dens(shortcut_prediction_1) +
xlab("predicted flipper length among penguins with 51mm bills")
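There is an analogous shortcut for the posterior model of μ itself: in recent rstanarm versions, posterior_epred() simulates the posterior of the expected value rather than of a new observation. A sketch, assuming the penguins_posterior fit from earlier:

```r
# Shortcut simulation of the posterior model for mu, the expected
# flipper length among penguins with 51mm bills
shortcut_mu <- posterior_epred(penguins_posterior,
                               newdata = data.frame(bill_length_mm = 51))

# Construct an 80% posterior credible interval for mu
posterior_interval(shortcut_mu, prob = 0.80)
```

This interval should closely match the (212, 214) interval computed by hand from the chain values above.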
Researchers’ prior understanding of the relationship between flipper length and body mass is that flipper length increases with body mass. They believe that flipper length typically increases by 0.01 mm with every one g increase in body mass, though this average increase could range from 0.006 to 0.014 mm per gram.
# Load and plot data
data(penguins_bayes)
ggplot(penguins_bayes, aes(x = body_mass_g, y = flipper_length_mm)) +
geom_point(size = 0.5) +
geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).
The observed data indicate that flipper length increases with body mass. Flipper length tends to be around 200 mm at the average body mass of 4207 g, but ranges from roughly 170 to 230 mm overall. The scatter around this pattern reflects some uncertainty about the association, yet flipper length is strongly associated with body mass and exhibits small variability at each given body mass.
penguins_5 <- na.omit(penguins_bayes) %>%
summarise(mean_body_mass_g = mean(body_mass_g),
mean_flipper_length = mean(flipper_length_mm),
std_dev = sd(flipper_length_mm))
penguins_5
## # A tibble: 1 × 3
## mean_body_mass_g mean_flipper_length std_dev
## <dbl> <dbl> <dbl>
## 1 4207. 201. 14.0
In a simple Normal regression model of flipper length Y on one predictor X, I think the σ parameter is bigger when X = bill length. Comparing the plot of flipper_length_mm versus body_mass_g above with the plot of flipper_length_mm versus bill_length_mm in Exercise 9.17 part a, the observations deviate more from the model line when the predictor is bill length. Since σ measures the degree to which observations deviate from the model line, σ is bigger when X = bill length than when X = body mass.
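One rough numerical check of this intuition, assuming penguins_bayes has been loaded, is to compare the residual standard deviations from ordinary least squares fits with each predictor (a frequentist proxy for the typical posterior σ):

```r
# Compare residual standard deviations under each single predictor;
# a larger value suggests a larger sigma in the Bayesian model too
penguins_complete <- na.omit(penguins_bayes)
sigma(lm(flipper_length_mm ~ bill_length_mm, data = penguins_complete))
sigma(lm(flipper_length_mm ~ body_mass_g,   data = penguins_complete))
```

If the reasoning above is right, the first value should exceed the second.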
penguins_posterior_default <- stan_glm(
flipper_length_mm ~ body_mass_g, data = penguins_bayes,
family = gaussian,
prior_intercept = normal(200, 15, autoscale = TRUE),
prior = normal(0.01, 0.002, autoscale = TRUE),
prior_aux = exponential(1, autoscale = TRUE),
prior_PD = FALSE,
chains = 4, iter = 5000*2, seed = 84735)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 2.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.23 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.204791 seconds (Warm-up)
## Chain 1: 0.366124 seconds (Sampling)
## Chain 1: 0.570915 seconds (Total)
## Chain 1:
## Chains 2-4 produce analogous sampling output (omitted).
# Store the 4 chains for each parameter in 1 data frame
penguins_posterior_default_df <- as.data.frame(penguins_posterior_default)
# Plot the posterior model of the β1 body_mass_g coefficient
ggplot(penguins_posterior_default_df, aes(x = body_mass_g)) +
geom_density()
The researchers’ posterior understanding is that flipper length increases with body mass. The posterior intercept is around 159 mm (plausibly ranging from roughly 158 to 160 mm); since no penguin has a body mass anywhere near 0 g, this intercept anchors the model line rather than describing a typical flipper length. Flipper length typically increases by about 0.0100 mm with every one g increase in body mass, and this average increase could plausibly range only from about 0.00995 to 0.0101 mm per gram. The posterior model thus indicates substantial certainty about the strength of the relationship: flipper length is strongly associated with body mass, and on average we can expect the observed flipper length of an individual penguin to deviate about 8.1 mm from the average flipper length among penguins of the same mass. Compared with the researchers’ prior understanding, the average increase per gram is essentially confirmed but now has a much smaller standard deviation. Overall, the researchers are far more confident about the relationship between flipper length and body mass after observing the data.
tidy(penguins_posterior_default, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.95)
## # A tibble: 4 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 159. 0.468 158. 160.
## 2 body_mass_g 0.0100 0.0000351 0.00995 0.0101
## 3 sigma 8.10 0.311 7.53 8.75
## 4 mean_PPD 201. 0.627 200. 202.
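To visualize this posterior certainty, we might overlay a sample of posterior plausible model lines on the scatterplot (a sketch, assuming the fitted model and data frame defined above):

```r
# Overlay 50 posterior plausible model lines on the observed data;
# tightly clustered lines reflect posterior certainty about the trend
penguins_posterior_default_df %>%
  slice_sample(n = 50) %>%
  ggplot() +
  geom_point(data = na.omit(penguins_bayes),
             aes(x = body_mass_g, y = flipper_length_mm), size = 0.5) +
  geom_abline(aes(intercept = `(Intercept)`, slope = body_mass_g),
              alpha = 0.2, color = "blue")
```

The 50 lines should be nearly indistinguishable, echoing the narrow credible intervals in the tidy() summary.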
penguins_prior_default_1 <- update(penguins_posterior_default, prior_PD = TRUE)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.5e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.19149 seconds (Warm-up)
## Chain 1: 0.146112 seconds (Sampling)
## Chain 1: 0.337602 seconds (Total)
## Chain 1:
## Chains 2-4 produce analogous sampling output (omitted).
prior_summary(penguins_prior_default_1)
## Priors for model 'penguins_prior_default_1'
## ------
## Intercept (after predictors centered)
## Specified prior:
## ~ normal(location = 200, scale = 15)
## Adjusted prior:
## ~ normal(location = 200, scale = 211)
##
## Coefficients
## Specified prior:
## ~ normal(location = 0.01, scale = 0.002)
## Adjusted prior:
## ~ normal(location = 0.01, scale = 3.5e-05)
##
## Auxiliary (sigma)
## Specified prior:
## ~ exponential(rate = 1)
## Adjusted prior:
## ~ exponential(rate = 0.071)
## ------
## See help('prior_summary.stanreg') for more details
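To see what these adjusted priors imply on the scale of the data, one could plot a sample of prior plausible model lines from the prior simulation (a sketch, assuming penguins_prior_default_1 from above; the axis limits are arbitrary choices spanning the observed data):

```r
# Plot 50 prior plausible model lines implied by the adjusted priors
as.data.frame(penguins_prior_default_1) %>%
  slice_sample(n = 50) %>%
  ggplot() +
  geom_abline(aes(intercept = `(Intercept)`, slope = body_mass_g),
              alpha = 0.2) +
  xlim(2500, 6500) + ylim(100, 300)
```

Because the intercept prior is rescaled to a much wider scale (211) while the slope prior is rescaled to a much narrower one (3.5e-05), the lines should share nearly identical slopes but vary widely in height.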